
Observability & Monitoring

A guide to setting up monitoring and observability for Exivity on Kubernetes.

Third-party observability

Prometheus, Prometheus Operator, and Grafana are third-party observability products. Exivity provides metrics endpoints, dashboard examples, and alert examples, but you are responsible for operating and supporting your monitoring platform.

Exivity Health Monitoring

Exivity provides built-in health monitoring through health endpoints and Exivity-specific metrics.

Built-in Health Endpoints

Every Exivity service exposes a health endpoint at /healthz on port 8000 for Kubernetes probes, and can optionally expose a metrics endpoint:

  • Liveness probes - Detect if the service needs restarting
  • Readiness probes - Determine if the service can accept traffic
  • Metrics endpoints - Expose Prometheus-compatible metrics at /metrics when prometheus.metricServer.enabled is set to true
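For orientation, the probe behaviour described above roughly corresponds to a container spec fragment like the one below. This is a minimal sketch, not the manifest the Exivity chart actually renders: it assumes plain HTTP GET probes, and the timing values are simply copied from the values.yaml example in the Configuration section.

```yaml
# Sketch of a Kubernetes container probe configuration (assumed shape).
# Only the path (/healthz) and port (8000) come from this guide;
# the use of httpGet is an assumption.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 3
  periodSeconds: 30
  failureThreshold: 120   # restart only after this many consecutive failures
readinessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 3
  periodSeconds: 30
  failureThreshold: 60    # stop routing traffic after this many consecutive failures
```

To see what is actually deployed, inspect the pod spec in your cluster, for example with kubectl get pod <pod-name> -o yaml.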

Exivity-Specific Metrics

Exivity exposes these custom metrics for monitoring:

| Metric Name | Description | Values | Usage |
|---|---|---|---|
| 📈 exivity_up | Service health status | 1 (healthy), 0 (down) | Overall service availability |
| 📈 exivity_command_healthy | Individual command health | 1 (healthy), 0 (unhealthy) | Component-level monitoring |
| 💾 exivity_nfs_dir_writable | NFS directory write status | 1 (writable), 0 (not writable) | Storage health monitoring |

Configuration

Enable Exivity Monitoring

Configure monitoring in your values.yaml:

# Enable health probes (enabled by default)
probes:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 3
    periodSeconds: 30
    failureThreshold: 120
  readinessProbe:
    enabled: true
    initialDelaySeconds: 3
    periodSeconds: 30
    failureThreshold: 60

# Enable Prometheus metrics collection
prometheus:
  metricServer:
    enabled: true
    serviceMonitor:
      enabled: true # Creates ServiceMonitor for prometheus-operator
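When the ServiceMonitor option is enabled, Prometheus Operator discovers the Exivity metrics endpoints through a ServiceMonitor resource. The sketch below shows what such a resource generally looks like. The resource name, namespace, labels, and port name are all hypothetical placeholders, and the object the Exivity chart actually creates may differ.

```yaml
# Illustrative ServiceMonitor (Prometheus Operator CRD). All names and labels are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: exivity-metrics               # hypothetical name
  namespace: exivity                  # hypothetical namespace
  labels:
    release: kube-prometheus-stack    # hypothetical; must match your Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: exivity # hypothetical Service label
  endpoints:
    - port: metrics                   # hypothetical named port on the Service
      path: /metrics                  # metrics path stated in this guide
      interval: 30s                   # assumed scrape interval
```

If Prometheus does not pick up the targets, a common cause is a mismatch between the ServiceMonitor labels and the serviceMonitorSelector configured on your Prometheus resource.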

Monitoring Dashboard and Alerts

Exivity provides ready-to-use monitoring configurations for Kubernetes deployments using Prometheus and Grafana. This allows you to monitor service health, NFS storage, and readiness directly in your cluster.

Grafana Dashboard

A ready-to-use Grafana dashboard is provided as a JSON file.

How to use:

  1. Import this JSON file into your Grafana instance (see Grafana import docs)
  2. Once imported, the dashboard visualizes Exivity service health, NFS writability, and command status using Prometheus metrics.
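As an alternative to importing through the Grafana UI, dashboards can be loaded from disk with Grafana's file-based provisioning. The following is a minimal sketch of a dashboard provider file. The provider name, folder, and path are hypothetical, and it assumes you have copied the Exivity dashboard JSON into that directory yourself.

```yaml
# Grafana dashboard provisioning file, e.g. placed under provisioning/dashboards/.
# Provider name, folder, and path are assumptions for illustration.
apiVersion: 1
providers:
  - name: exivity                # hypothetical provider name
    orgId: 1
    folder: Exivity              # Grafana folder in which the dashboard appears
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30    # how often Grafana rescans the path
    allowUiUpdates: true
    options:
      path: /var/lib/grafana/dashboards/exivity   # hypothetical directory holding the dashboard JSON
```

Provisioning keeps the dashboard in version control and restores it automatically if the Grafana pod is recreated.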

Prometheus Alert Rules

A set of Prometheus alert rules is provided for Exivity:

  • File: readiness-probe.rules.yaml (download)
  • Alerts included:
    • ServiceDown - Critical alert when exivity_up == 0 for 10 minutes
    • NfsDirNotWritable - Critical alert when exivity_nfs_dir_writable == 0 for 10 minutes
    • CommandHealthy - Critical alert when exivity_command_healthy == 0 for 10 minutes

How to use:

  1. Add this YAML file to your Prometheus alerting rules configuration
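To illustrate the alerts listed above, a Prometheus rules file expressing them could look like the sketch below. It is reconstructed from the descriptions in this guide only: the group name, severity label, and annotation text are assumptions, and the provided readiness-probe.rules.yaml may be structured differently.

```yaml
# Sketch of a Prometheus alerting rules file. Expressions and durations follow
# the alert descriptions in this guide; group name, labels, and annotations are assumed.
groups:
  - name: exivity-health           # hypothetical group name
    rules:
      - alert: ServiceDown
        expr: exivity_up == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Exivity service has been down for 10 minutes"
      - alert: NfsDirNotWritable
        expr: exivity_nfs_dir_writable == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Exivity NFS directory has not been writable for 10 minutes"
      - alert: CommandHealthy
        expr: exivity_command_healthy == 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "An Exivity command has been unhealthy for 10 minutes"
```

With a standalone Prometheus, a file like this is referenced from the rule_files list in prometheus.yml. With Prometheus Operator, the same groups are typically wrapped in a PrometheusRule resource instead.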

Requirements

  • Prometheus must be scraping the Exivity metrics endpoints
  • Prometheus Operator must be installed if you enable prometheus.metricServer.serviceMonitor.enabled
  • Grafana must be connected to your Prometheus data source
  • For more details on setup, see the Prometheus and Grafana documentation
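If you run Prometheus without Prometheus Operator, the first requirement can be met with a static scrape job. The sketch below assumes the metrics endpoint is served on the same port as the health endpoint (8000), which this guide does not state explicitly, and it uses a hypothetical Service DNS name. Adjust both to match your deployment.

```yaml
# Fragment of prometheus.yml. Target host and port are assumptions.
scrape_configs:
  - job_name: exivity
    metrics_path: /metrics        # metrics path stated in this guide
    scrape_interval: 30s          # assumed interval
    static_configs:
      - targets:
          - exivity-service.exivity.svc.cluster.local:8000   # hypothetical Service DNS name; port assumed
```

After reloading Prometheus, the job should appear on the Targets page, and a query for exivity_up should return a series for each scraped service.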